SWAT is the open-source Python interface to SAS’ cloud-based, fault-tolerant, in-memory analytics server.

  • Connects to CAS using binary (currently Linux only) or REST interface
  • Calls CAS analytic actions and returns results in Python objects
  • Implements a Pandas API that calls CAS actions in the background

How Does it Work?


In [ ]:
import swat

In [ ]:
conn = swat.CAS('cas01', 49786)

Calling CAS Actions


In [ ]:
conn.serverstatus()

In [ ]:
conn.userinfo()

In [ ]:
conn.help();

Loading Data


In [ ]:
tbl2 = conn.read_csv('https://raw.githubusercontent.com/'
                    'sassoftware/sas-viya-programming/master/data/cars.csv', 
                     casout=conn.CASTable('cars'))
tbl2

CASTable


CASTable objects contain a reference to a CAS table as well as filtering and grouping options, and computed columns.


In [ ]:
conn.tableinfo()

In [ ]:
tbl = conn.CASTable('attrition')

In [ ]:
tbl.columninfo()

In [ ]:
tbl2?

In [ ]:
tbl2.fetch()

Exploring Data


In [ ]:
tbl.summary()

In [ ]:
tbl.freq(inputs='Attrition')

Building Analytical Models


In [ ]:
conn.loadactionset('regression')
conn.help(actionset='regression');

In [ ]:
output = tbl.logistic(
    target='Attrition',
    inputs=['Gender', 'MaritalStatus', 'AccountAge'],
    nominals = ['Gender', 'MaritalStatus']
)

In [ ]:
output.keys()

In [ ]:
output

In [ ]:
from swat.render import render_html
render_html(output)

Pandas-style DataFrame API


Many Pandas DataFrame features are available on the CASTable objects.


In [ ]:
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/'
                 'sassoftware/sas-viya-programming/master/data/cars.csv')
df.describe()

In [ ]:
tbl2.describe()

In [ ]:
tbl2.groupby('Origin').describe()

In [ ]:
tbl[['Gender', 'AccountAge']].head()

Visualization


In [ ]:
from bokeh.plotting import show, figure
from bokeh.charts import Bar
from bokeh.io import output_notebook
output_notebook()

In [ ]:
output1 = tbl.freq(inputs=['Attrition'])

p = Bar(output1['Frequency'], 'FmtVar', 
        values='Frequency',
        color="#1f77b4", 
        agg='mean', 
        title="", 
        xlabel = "Attrition",
        ylabel = 'Frequency',        
        bar_width=0.8,
        plot_width=600, 
        plot_height=400 
)
show(p)


In [ ]:
conn.tableinfo()

In [ ]:
tbl2.groupby(['Origin', 'Type']).describe()

In [ ]:
tbl2[['MPG_CITY', 'MPG_Highway', 'MSRP']].describe()

In [ ]:
tbl2[(tbl2.MSRP > 90000) & (tbl2.Cylinders < 12)].head()


In [ ]:
conn.runcode(code='''
    data cars_temp;
        set cars;
        sqrt_MSRP = sqrt(MSRP);
        MPG_avg = (MPG_city + MPG_highway) / 2;
    run;
''')

In [ ]:
conn.tableinfo()

In [ ]:
conn.loadactionset('fedsql')

conn.fedsql.execdirect(query='''
    select make, model, msrp,
    mpg_highway from cars
        where msrp > 80000 and mpg_highway > 20
''')